A method of producing a bi-level representation
of an image comprises (a) making a low-sensitivity 
scan of the image, (b) making a high sensitivity scan 
of the image, (c) subtracting the low sensitivity 
scanned image from the high sensitivity scanned 
image, and (d) performing a binary OR operation 
pixelwise between the low sensitivity scanned image
and the result of the subtraction to form the bi-level
representation. The invention enables advantage
to be taken of state of the art image subtraction
methods which are robust to misalignment of the 
images and noise present therein and enables a 
"high dynamic range image" to be generated where 
both dense and faint areas are clearly visible. 
The invention relates to image processing and,
more particularly, to an improved method of producing
a bi-level representation of an image.

As a first step of any computerized image 
handling process each image must be scanned and
digitised. This digitisation is generally, though not 
exclusively, performed by the image scanning de- 
vice which provides a digitised form of the image 
to a computer for storage and/or subsequent pro- 
cessing. 

Modern applications of image processing techniques,
such as the reading of documents for subsequent
feeding to character recognition apparatus,
require reliable and robust scanning and digitisation 
techniques which will consistently produce an ac- 
ceptable digitised form of a scanned image. 

Image digitisation is normally achieved by di- 
viding the image into small rectangular areas, and 
comparing brightness levels corresponding to each 
area with a threshold parameter, with pels above 
the threshold being set to 1 and pels below set to 
0. Overall image quality depends on the choice of 
threshold and modern image scanners generally
provide a range of sensitivity settings.

One problem with this technique is that it is 
hard to predict which threshold parameter choice is 
optimal. Furthermore, a single choice of sensitivity 
may be inadequate for obtaining good quality of 
the entire image because of variations in the image 
density over the image. 
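
The trade-off described above can be illustrated with a minimal sketch. The
readings, the 0-255 scale and the two threshold values are hypothetical and
chosen only for illustration; the function name binarize is likewise an
assumption. Following the convention in the text, values above the threshold
become 1 (treating a value equal to the threshold as 0 is an assumption).

```python
def binarize(readings, threshold):
    """Return a bi-level image: 1 where a reading exceeds the threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in readings]


# Hypothetical one-row page: columns 0-1 are dense print, column 2 is a halo
# around the print, columns 3-4 are background, columns 5-6 are faint handwriting.
readings = [[230, 220, 120, 20, 15, 110, 100]]

low_sensitivity = binarize(readings, 150)   # handwriting is lost
high_sensitivity = binarize(readings, 90)   # handwriting kept, but halo kept too

print(low_sensitivity)   # [[1, 1, 0, 0, 0, 0, 0]]
print(high_sensitivity)  # [[1, 1, 1, 0, 0, 1, 1]]
```

In this toy row neither threshold gives the desired result on its own: the
higher threshold drops the faint strokes, while the lower one thickens the
dense print.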

The only reliable existing solution to the thresh- 
old selection problem is to have manual image 
quality control. An operator must look at each 
scanned image and, in case of unsatisfactory quality,
the scan can be repeated with a different threshold
parameter setting. Since such a manual process
is labour-intensive and expensive, some large
applications, which otherwise could make use of 
digital image processing techniques, can be ren- 
dered uneconomic by the lack of robust scanning 
techniques. 

One example of an application of image pro- 
cessing techniques where robust scanning and 
digitisation is of paramount importance is the auto- 
mated sorting of mail using optical character rec- 
ognition of digitised address labels or shipping 
information. 

It is the object of the invention to enable bi- 
level scan images to be generated with improved 
image quality. 

To achieve this, the invention provides a method
for producing a bi-level representation of an
image, characterised by the steps of: scanning the
image one or more times to generate first and
second bi-level representations of the image (LS,
HS), each comprising a plurality of pixels, the second
bi-level representation (HS) being generated
with a higher sensitivity than the first representation
(LS); subtracting the first bi-level representation of
the image (LS) from the second bi-level representation
of the image (HS); and performing a
binary OR pixelwise between the result of the subtraction
(D) and the first bi-level representation (LS)
to produce an output bi-level representation.

The invention enables advantage to be taken of
state of the art image subtraction methods which
are robust to misalignment of the images and noise
present therein and enables a "high dynamic range
image" to be generated where both dense and
faint areas are clearly visible. Pages are scanned
more than once using different sensitivity settings
and the resulting images combined. Such combination
is done seamlessly so that each image contributes
its "good areas".

The particular implementation described uses a
subtraction technique originally developed for form
drop-out applications, and described in EP 0411 231,
in order to achieve good quality image subtraction
in the presence of scanning noise and
misalignment between the scanned images.

The invention also enables image processing
apparatus, which can include a scanning device, to
be provided arranged to perform the above method.

An embodiment of the invention will now be 
described, by way of example, with reference to 
the accompanying drawings, wherein: 
Fig. 1 is a general view of the apparatus used in
the embodiment;

Fig. 2 is a schematic view of the apparatus of
Fig. 1;

Fig. 3 is a flow diagram showing the main steps
of the method;

Fig. 4 is a flow diagram illustrating the subtraction
method used in the embodiment;

Figs. 5, 6, 7 and 8 show an example of the
application of the method to a particular document.

The embodiment described here involves the 
handling of digitised bi-level images where each 
pixel takes one of two possible values. It will be 
understood that the words 'black' and 'white' and
'1' and '0' as used herein refer to these binary
states and not necessarily to the actual colour of 
the pixel when displayed or printed. 

The apparatus used in the embodiment of the
invention is shown generally in Fig. 1. It comprises
personal computer 10 which is connected to scanning
device 20. Fig. 2 is a schematic diagram
showing the relevant parts of computer 10 and
scanner 20. Processor 30 is connected via bus 40
to disk storage device 50, display device 60, and user
input devices - keyboard and mouse - shown generally
at 70. Processor 30 is connected via bus 40
and input/output adapter 80 to scanning device 20.



EP 0 600 613 A2 



Scanning device 20 is a conventional device of
a type well known in the art. It is used to scan
document pages placed therein and pass digitised
images thereof to computer 10, where they are stored in
storage device 50. Image digitisation is achieved in
scanner 20 by measuring the brightness of light
reflected from small rectangular areas of the pages,
the areas being arranged in rows and columns to
cover the whole page, and comparing the brightness
levels corresponding to each area with a
threshold parameter. In the digitised image produced
by the scanner, pels above the threshold are
set to 1 and pels below set to 0. The threshold
parameter can be varied either manually by the
user or under the control of computer 10 in a
manner which again is well known in the art.

The method of operation of the apparatus is as
follows. Two scans are made of the same page
with different values of the threshold parameter.
The digitised images which these scans produce
are stored in storage device 50. The images are
then processed by processor 30 to produce a
combined image, also stored in storage device 50.

Overall, the method comprises four stages as
illustrated in Fig. 3.

a. Low sensitivity (LS) scan. 

The brightness value for each pixel is compared,
in scanner 20, with a first threshold value
and a corresponding pixel value in a first intermediate
image LS is set to black if the brightness
exceeds the first threshold. LS is stored in storage
device 50.

b. High sensitivity (HS) scan.

The brightness value for each pixel is compared,
in scanner 20, with a second threshold value,
the second threshold value being lower than
the first threshold value to give higher sensitivity,
and a corresponding pixel value in a second intermediate
image HS is set to black if the brightness
exceeds the second threshold. HS is stored in
storage device 50.

The processing of the images to produce a
higher quality output image is performed by processor
30 under the control of a suitable program,
which program is also stored in storage device 50.
This processing comprises two stages - c and d.

c. Subtraction of the low sensitivity image LS from 
the high sensitivity image HS. 

To perform the subtraction, use is made of the
technique described in EP-A-0411231 which has
become known as form-drop-out. This technique
was developed as a compression/decompression
scheme for scanned paper forms which achieves
high compression ratios by removing template information
common to all forms of the same type.
The result of the compression of a form using this
method is a compressed image consisting of the
filled-in information only. Such a method ensures
that the image, when encoded using conventional
methods such as run-end or run-length encoding,
will take up less space because the information
content of the compressed form is reduced.
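
As a rough illustration of why removing template information helps a
run-length encoder, the sketch below counts the runs in one scan line before
and after drop-out. The two rows are invented data and run_lengths is a
hypothetical helper; the actual encoding used in EP-A-0411231 is not
described here and is not being reproduced.

```python
from itertools import groupby


def run_lengths(row):
    """Encode a row of 0/1 pixels as (value, run length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(row)]


# Hypothetical scan line of a filled-in form: template marks plus an entry.
form_row = [1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1]
# The same line after the template has been dropped out.
filled_in_row = [0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0]

print(len(run_lengths(form_row)))       # 7 runs
print(len(run_lengths(filled_in_row)))  # 3 runs
```

Fewer runs per line means fewer symbols for the encoder to store, which is
the effect the paragraph above attributes to the reduced information content.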

The inventors have found that this technique 
can advantageously be used in this application 
because it works even when repetitive areas differ 
slightly from each other due to the presence of
scanning noise.

After a registration phase described in detail in
EP-A-0411231, in which the high sensitivity image HS
is aligned geometrically to fit the low sensitivity
image LS by optimisation of parameters of a transformation
applied to HS, a subtraction phase is
used in order to locate and remove the low sensitivity
image LS from the high sensitivity image
HS. The method used is based on NxN pixel
neighbourhoods.

A pixel in the subtraction result image will be
black if and only if the corresponding pixel in HS is
black, and the corresponding pixel in LS is white,
and either its neighbourhood in LS is completely
white, or there exists a black pixel in its
neighbourhood in the result image. The latter can
be determined by repeatedly applying tests for the
three conditions until the result image is no longer
changed.
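
The repeated application of the three conditions can be sketched as follows.
This is a minimal illustration, not the implementation of EP-A-0411231: images
are assumed to be lists of rows with 1 for black and 0 for white, the
neighbourhood is assumed to be the square window of pixels within distance n
clipped at the image border, and the names neighbourhood and
subtract_iterative are hypothetical.

```python
def neighbourhood(img, i, j, n):
    """Yield the pixels of img within distance n of (i, j), clipped at the border."""
    rows, cols = len(img), len(img[0])
    for k in range(max(0, i - n), min(rows, i + n + 1)):
        for l in range(max(0, j - n), min(cols, j + n + 1)):
            yield img[k][l]


def subtract_iterative(hs, ls, n):
    """Apply the three conditions repeatedly until the result stops changing."""
    rows, cols = len(hs), len(hs[0])
    d = [[0] * cols for _ in range(rows)]
    changed = True
    while changed:
        changed = False
        for i in range(rows):
            for j in range(cols):
                # Only pixels black in HS, white in LS and not yet added qualify.
                if d[i][j] or not hs[i][j] or ls[i][j]:
                    continue
                ls_clear = not any(neighbourhood(ls, i, j, n))
                near_added = any(neighbourhood(d, i, j, n))
                if ls_clear or near_added:
                    d[i][j] = 1
                    changed = True
    return d


ls = [[1, 1, 0, 0, 0, 0, 0]]
hs_isolated = [[1, 1, 1, 0, 0, 1, 1]]   # halo at column 2, faint stroke at 5-6
hs_touching = [[1, 1, 1, 1, 1, 1, 0]]   # faint stroke running into the dense area

print(subtract_iterative(hs_isolated, ls, 1))  # [[0, 0, 0, 0, 0, 1, 1]]
print(subtract_iterative(hs_touching, ls, 1))  # [[0, 0, 1, 1, 1, 1, 0]]
```

In the first row the halo pixel next to the dense area is removed; in the
second the same pixel is kept because it is connected to pixels that have
already been added.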

However, a more efficient one-pass method is
used in this embodiment, as shown in Fig. 4. LS is
first expanded by N pixels to form an intermediate
image LSn, i.e. the operation

if Neighbour(LS,i,j,N) ≠ white then LSn(i,j) = black else LSn(i,j) = white

is performed for the pixels (i,j) of the image, where
Neighbour(a,i,j,N) returns white if all the pixels a(k,l)
such that i-N ≤ k ≤ i+N and j-N ≤ l ≤ j+N are white.
Next, LSn is subtracted from HS to form intermediate
image A, i.e. the operation

if HS(i,j) = black and LSn(i,j) = white then A(i,j) = black else A(i,j) = white

is performed. The intermediate image A is
then expanded, as described above for LS, by N to
form intermediate image An. Lastly, an AND operation
is performed between An and HS to form
result image D.
As a result of this subtraction process all image
features which are visible on the LS image disappear
from HS; only faint areas will remain for the final
stage.
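
The one-pass subtraction (expand LS, subtract from HS, expand the difference,
AND with HS) can be sketched as below. It is an illustration under stated
assumptions rather than the code of the embodiment: images are lists of rows
with 1 for black and 0 for white, the window is clipped at the image border,
registration is assumed to have been done already, and the function names
expand and subtract_one_pass are hypothetical.

```python
def expand(img, n):
    """Set a pixel black if any pixel within distance n of it is black."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(max(0, i - n), min(rows, i + n + 1)):
                if any(img[k][max(0, j - n):min(cols, j + n + 1)]):
                    out[i][j] = 1
                    break
    return out


def subtract_one_pass(hs, ls, n):
    """Return D: expand LS, subtract it from HS, expand the result, AND with HS."""
    ls_n = expand(ls, n)
    # A is black where HS is black and the expanded LS is white.
    a = [[1 if h and not e else 0 for h, e in zip(h_row, e_row)]
         for h_row, e_row in zip(hs, ls_n)]
    a_n = expand(a, n)
    # D keeps only those expanded pixels that are actually black in HS.
    return [[1 if x and h else 0 for x, h in zip(x_row, h_row)]
            for x_row, h_row in zip(a_n, hs)]


ls = [[1, 1, 0, 0, 0, 0, 0]]
hs = [[1, 1, 1, 0, 0, 1, 1]]   # halo at column 2, faint stroke at columns 5-6

print(subtract_one_pass(hs, ls, 1))  # [[0, 0, 0, 0, 0, 1, 1]]
```

Note that this form spreads added pixels only one neighbourhood outward from
image A, whereas the repeated-test definition can chain through several
neighbourhoods, so the two need not agree on every input.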

d. Combining drop-out image D with LS into a
single output image. 

This is achieved by "OR-ing" both images 
pixelwise, ie a pixel in the output image will be 
black if it is black in either LS or D. Note that all
the black and dense areas appear on both LS and 
HS images. Hence, in D these areas are white and 
will coincide in LS and the output image. The 
effective result is as if dense areas were scanned 
using low sensitivity. However, the faint areas will 
appear on HS but not on LS. Therefore these areas 
will have migrated from HS to D. Accordingly, as 
desired, the output image will also include faint 
areas as they have appeared in HS.
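
The pixelwise OR of stage d can be sketched in a few lines. The images are
assumed to be lists of rows with 1 for black and 0 for white, the sample rows
are invented, and the function name combine is hypothetical.

```python
def combine(ls, d):
    """Pixelwise OR: a pixel is black if it is black in either LS or D."""
    return [[1 if a or b else 0 for a, b in zip(ls_row, d_row)]
            for ls_row, d_row in zip(ls, d)]


ls = [[1, 1, 0, 0, 0, 0, 0]]   # dense print only
hs = [[1, 1, 1, 0, 0, 1, 1]]   # dense print with halo, plus faint stroke
d = [[0, 0, 0, 0, 0, 1, 1]]    # drop-out image: faint stroke only

output = combine(ls, d)
print(output)  # [[1, 1, 0, 0, 0, 1, 1]]
```

In this toy row the output takes the dense area from LS, without the halo
pixel that HS shows at column 2, and the faint stroke from D.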

For example, consider the image depicted in
Fig. 5. The image of Fig. 5 was created by scanning
a page of an IRS tax return using low sensitivity.
Printed areas, including small print, are clear
and legible but handwriting has almost disappeared.
The handwriting can be made to appear by
increasing scan sensitivity with the result depicted
in Fig. 6. Now, handwriting is clear but dense areas
are badly distorted. Both the images of Fig. 5 and
Fig. 6 are unsatisfactory, each for its own reasons.

After subtracting the image of Fig. 5 from the
image of Fig. 6 as described above, the difference
image of Fig. 7 is obtained. Combining the images of
Fig. 5 and Fig. 7 yields the image of Fig. 8 where
both dense and faint areas are clearly visible.

The form drop-out technique as described in 
EP-A-0411231 is used with advantage in the em- 
bodiment, but it will be understood that other sub- 
traction methods of a similar nature could also be 
used. 

The embodiment has been described in terms 
of a suitably programmed general purpose computer
used in conjunction with a conventional scanning
device but, in practice, it will be understood
that the invention could be implemented in hardware,
either as part of the scanner or as a special
purpose adapter for use with the computer or a 
stand alone hardware device or be implemented as 
any combination of hardware and software. 

In this embodiment two scans of the same 
image are made one after the other but equally it is 
possible to work with a scanner having two dif- 
ferent sensors so that only a single scan would be 
necessary, the processing potentially being per- 
formed dynamically during the scan. It will be fur- 
ther understood that the method of the invention 
could be used to produce a bi-level image from a 
greyscale image by scanning the greyscale image 
and comparing the greyscale pixels with different 
threshold values to produce the HS and LS images. 
The term "brightness" as used herein could thus 
refer to a stored greyscale value. 



In this embodiment it has been found that only 
two sensitivities suffice to produce a good quality 
result but, if necessary, similar ideas can be extended
to the combination of scans with 3 or more
sensitivity settings.

Claims 

1. Method for producing a bi-level representation
of an image, characterised by the steps of:

Scanning the image one or more times to
generate first and second bi-level representations
of the image (LS, HS), each comprising a
plurality of pixels, the second bi-level representation
(HS) being generated with a higher
sensitivity than the first representation (LS);

Subtracting the first bi-level representation
of the image (LS) from the second bi-level
representation of the image (HS),

Performing a binary OR pixelwise between
the result of the subtraction (D) and the first bi-level
representation (LS) to produce an output
bi-level representation.

2. A method as claimed in claim 1 wherein the
generation of the first and second bi-level representations
comprises comparing the brightness
value for each pixel with a threshold
value and setting a corresponding pixel value
in the representation to black if the brightness
exceeds the threshold, the threshold
used for generating the second representation
(HS) being lower than the threshold used to
generate the first representation (LS).

3. A method as claimed in claim 1 or claim 2
wherein the subtraction step comprises determining
for each pixel whether the pixel is an
added pixel and, if so, assigning the pixel to
be black in the subtraction result (D), wherein a
pixel is an added pixel if the corresponding
pixel in the first bi-level representation (LS) is
white and the corresponding pixel in the second
bi-level representation (HS) is black and
either the neighbourhood of the corresponding
pixel in the second bi-level representation (HS)
is black or there is a pixel in the neighbourhood
of the pixel which is also an added pixel.

4. Method as claimed in claim 3 wherein the step
of determining for each pixel whether the pixel
is an added pixel comprises the steps of:
(i) expanding the first bi-level representation
(LS) by setting to black each pixel therein
which has a black neighbour in a distance
less than or equal to N to form an expanded
representation (LSn);
(ii) subtracting the expanded representation
(LSn) pixel-wise from the second bi-level
representation (HS) to form a third bi-level
representation (A);
(iii) expanding the third bi-level representation
(A) by setting to black each pixel therein
which has a black neighbour in a distance
less than or equal to N to form an expanded
third bi-level representation (An); and
(iv) performing a binary AND pixelwise between
the expanded third bi-level representation
(An) and the second bi-level representation
(HS) to form an image in which
the black pixels are added pixels.

5. A method as claimed in any preceding claim 
wherein the first and second bi-level images 
are generated from a greyscale representation 
of the image. 

6. Image processing apparatus arranged to per- 
form a method as claimed in any preceding 
claim. 



7. Image processing apparatus as claimed in
claim 6 comprising a scanning device (20) for
scanning a document to produce the first and
second bi-level representations (LS, HS).
